51 research outputs found

Language Resources – a Part of World Cultural Heritage

Author: Dimitrova Ludmila
Publication venue: Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Publication date: 01/01/2011

This article briefly reviews multilingual language resources for Bulgarian developed within several international projects: the first-ever annotated Bulgarian MTE digital lexical resources, the Bulgarian-Polish corpus, the Bulgarian-Slovak parallel and aligned corpus, and the Bulgarian-Polish-Lithuanian corpus. These resources are valuable multilingual datasets for language engineering research and development for Bulgarian. Multilingual corpora are large repositories of language data that play an important role in preserving and supporting the world's cultural heritage, because natural language is an outstanding part of human cultural values and collective memory, and a bridge between cultures. *1 EU COP Project 106 MULTEXT-East Multilingual Text Tools and Corpora for Central and Eastern European Languages *2 Semantics and Contrastive Linguistics with a Focus on a Bilingual Electronic Dictionar

Bulgarian Digital Mathematics Library at IMI-BAS

Lexicographic Tools and Techniques.

Author: Dimitrova Ludmila
Pavlov Radoslav
Publication venue: Institute for Information Transmission Problems - Russian Academy of Sciences
Publication date: 01/01/2008

We briefly describe what grid technologies are and how they could contribute to language technologies, in particular to lexicographic activities. Based on our participation in the EC international project MULTEXT-East, we present some aspects of language resource compatibility: unification and standardisation. We underline the importance of the harmonised lexical (morphosyntactic) specifications and descriptions of language data developed in machine-readable form in a common standard encoding format (the Corpus Encoding Standard format) for six Central and East European (CEE) languages, as well as the language independence of the tools employed. The study and preparation of these results have received funding from the EC's Seventh Framework Programme [FP7/2007-2013] under grant agreement 211938 MONDILEX.

Bulgarian OpenAIRE Repository

Bilingual Corpus - Digital Repository for Preservation of Language Heritage

Author: Dimitrova Ludmila
Garabík Radovan
Publication venue: Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Publication date: 01/01/2012

The article briefly reviews a bilingual Slovak-Bulgarian/Bulgarian-Slovak parallel and aligned corpus. The corpus was collected and developed as a result of collaboration within a joint research project between the Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, and the Ľ. Štúr Institute of Linguistics, Slovak Academy of Sciences. Multilingual corpora are large repositories of language data that play an important role in preserving and supporting the world's cultural heritage, because natural language is an outstanding part of human cultural values and collective memory, and a bridge between cultures. This bilingual corpus will be widely applicable to contrastive studies of the two Slavic languages and will also be a useful resource for language engineering research and development, especially in machine translation.

Bulgarian Digital Mathematics Library at IMI-BAS

Multilingual digital resources with Bulgarian language

Author: Dimitrova Ludmila
Publication venue: Institute of Slavic Studies Polish Academy of Sciences
Publication date: 01/11/2015

The paper briefly presents Bulgarian language resources as part of the multilingual digital resources developed within several international projects, among them parallel annotated and aligned corpora, comparable corpora, morpho-syntactic specifications for corpus annotation and dictionary encoding, lexicons, lexical databases, and electronic dictionaries.

Directory of Open Access Journals

Information Technologies for the Preservation of Language Heritage

Author: Dimitrova Ludmila
Dutsova Ralitsa
Panova Rumiana
Publication venue: Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Publication date: 01/01/2011

In this paper we show how information technologies, as tools for creating digital bilingual dictionaries, can help preserve natural languages. Natural languages are an outstanding part of human cultural values, and for that reason they should be preserved as part of the world's cultural heritage. We describe our work on the bilingual lexical database supporting the Bulgarian-Polish online dictionary. The main software tools for the web presentation of the dictionary are briefly described. We pay special attention to the presentation of verbs, the linguistic category in Bulgarian that is richest in specific characteristics.

Bulgarian Digital Mathematics Library at IMI-BAS

Trilingual aligned corpus – current state and new applications

Author: Danuta Roszko
Ludmila Dimitrova
Roman Roszko
Violetta Koseska
Publication venue: Institute of Slavic Studies Polish Academy of Sciences
Publication date: 01/01/2014

This article describes the current state of a trilingual parallel corpus consisting of texts in two Slavic languages (Bulgarian and Polish) and one Baltic language (Lithuanian). The corpus contains original literary texts (fiction, novels, and short stories) in one of the three languages with translations into the other two, as well as texts in other languages translated into Bulgarian, Polish, and Lithuanian. Some of the texts are aligned at the sentence level. The authors propose a semantic annotation of the verbs appearing in these aligned texts that will facilitate contrastive studies of natural languages. The theoretical background for the proposed semantic annotation is also briefly discussed.

Crossref

Biblioteka Nauki - repozytorium artykułów

Directory of Open Access Journals

Extraction and Presentation of Bilingual Correspondences from Slovak-Bulgarian Parallel Corpus

Author: Dimitrova Ludmila
Garabík Radovan
Publication venue: Institute of Slavic Studies Polish Academy of Sciences
Publication date: 01/01/2015

This paper describes the results of the automatic extraction and presentation of bilingual correspondences from the Slovak-Bulgarian parallel corpus. The equivalent phrases are extracted from a corpus automatically aligned at the sentence and word level, then filtered, indexed, and presented in a dictionary-like interface. The bilingual dictionary database contains 80 thousand phrase pairs consisting of approximately 350 thousand words in each language. Counting unique word forms, the size is 31 thousand in the Slovak part of the dictionary and 26 thousand in the Bulgarian part.

Crossref

Biblioteka Nauki - repozytorium artykułów

Directory of Open Access Journals

Translation equivalence of demonstrative pronouns in Bulgarian-Slovak parallel texts

Author: Dimitrova Ludmila
Garabík Radovan
Publication venue: Institute of Slavic Studies Polish Academy of Sciences
Publication date: 01/01/2014

In this paper we describe our automatic analysis of several parallel Bulgarian-Slovak texts, with the goal of obtaining useful information about the Slovak translation equivalents of (definite) articles and demonstrative pronouns in Bulgarian. Rather than focusing on individual translation equivalents, we present a method for the automatic extraction and visualization of the translations. This can serve as a guide for pinpointing interesting features in specific translated documents and could be extended to other parts of speech or otherwise identifiable textual units.

Crossref

Biblioteka Nauki - repozytorium artykułów

Directory of Open Access Journals

Presentation of the verbs in Bulgarian-Polish electronic dictionary

Author: Ludmila Dimitrova
Violetta Koseska
Publication venue: Institute of Slavic Studies Polish Academy of Sciences
Publication date: 01/01/2014

This paper briefly discusses the presentation of verbs in the first electronic Bulgarian-Polish dictionary, which is currently being developed under a bilateral collaboration between IMI-BAS and ISS-PAS. Special attention is given to the digital entry classifiers that describe Bulgarian and Polish verbs. Problems related to the correspondence between natural language phenomena and their presentation are discussed. Some examples illustrate the different types of dictionary entries for verbs.

Crossref

Biblioteka Nauki - repozytorium artykułów

Directory of Open Access Journals

Implementation of the Bulgarian-Polish online dictionary

Author: Dimitrova Ludmila
Dutsova Ralitsa
Publication venue: Institute of Slavic Studies Polish Academy of Sciences
Publication date: 01/11/2015

The paper describes the implementation of an online Bulgarian-Polish dictionary as a technological tool for applications in digital humanities. This bilingual digital dictionary is being developed within the joint research project “Semantics and Contrastive Linguistics with a focus on a bilingual electronic dictionary” between IMI-BAS and ISS-PAS, supervised by L. Dimitrova (IMI-BAS) and V. Koseska-Toszewa (ISS-PAS). In addition, the main software tools for the web presentation of the dictionary are briefly described.

Directory of Open Access Journals